A Random Variable Substitution Lemma With
Applications to Multiple Description Coding 

Jia Wang, Jun Chen, Lei Zhao, Paul Cuff, and Haim Permuter 

Abstract 

We establish a random variable substitution lemma and use it to investigate the role of the refinement layer in
multiple description coding, which clarifies the relationship among several existing achievable multiple description
rate-distortion regions. Specifically, it is shown that the El Gamal-Cover (EGC) region is equivalent to the EGC*
region (an antecedent version of the EGC region) while the Venkataramani-Kramer-Goyal (VKG) region (when
specialized to the 2-description case) is equivalent to the Zhang-Berger (ZB) region. Moreover, we prove that for
multiple description coding with individual and hierarchical distortion constraints, the number of layers in the VKG
scheme can be significantly reduced when only certain weighted sum rates are concerned. The role of the refinement
layer in scalable coding (a special case of multiple description coding) is also studied.

Index Terms 

Contra-polymatroid, multiple description coding, rate-distortion region, scalable coding, successive refinement.

I. Introduction 

A fundamental problem of multiple description coding is to characterize the rate-distortion region, which is the 
set of all achievable rate-distortion tuples. El Gamal and Cover (EGC) obtained an inner bound of the 2-description 
rate-distortion region, which is shown to be tight for the no excess rate case by Ahlswede [1]. Zhang and Berger 
(ZB) [23] derived a different inner bound of the 2-description rate-distortion region and showed that it contains rate- 
distortion tuples not included in the EGC region. The EGC region has an antecedent version, which is sometimes 
referred to as the EGC* region. The EGC* region was shown to be tight for the quadratic Gaussian case by Ozarow 
[13]. However, the EGC* region has been largely abandoned in view of the fact that it is contained in the EGC region 
[23]. Other work on the 2-description problem can be found in [8], [9], [12], [24]. Recent years have seen growing
interest in the general L-description problem [14], [15], [18], [19], [21]. In particular, Venkataramani, Kramer,
and Goyal (VKG) [21] derived an inner bound of the L-description rate-distortion region. It is well understood that 
for the 2-description case both the EGC region and the ZB region subsume the EGC* region while all these three 
regions are contained in the VKG region; moreover, the reason that one region contains another is simply because 
more layers are used. Indeed, the ZB scheme has one more common description layer than the EGC* scheme 
while the EGC scheme and the VKG scheme have one more refinement layer than the EGC* scheme and the ZB 
scheme, respectively. Although it is known [23] that the EGC* scheme can be strictly improved via the inclusion 
of a common description layer, it is still unclear whether the refinement layer has the same effect. We shall show 
that in fact the EGC region is equivalent to the EGC* region and the VKG region is equivalent to the ZB region; 
as a consequence, the refinement layer can be safely removed. 

An important special case of the 2-description problem is called scalable coding, also known as successive 
refinement.¹ The rate-distortion region of scalable coding has been characterized by Koshelev [10], [11] and Equitz
and Cover [6] for the no rate loss case and by Rimoldi [16] for the general case. In scalable coding, the second 
description is not required to reconstruct the source; instead, it serves as a refinement layer to improve the first 
description. However, it is clearly of interest to know whether the refinement layer itself in an optimal scalable 
coding scheme can be useful, i.e., whether one can achieve a nontrivial distortion using the refinement layer alone. 
This problem is closely related, but not identical, to multiple description coding with no excess rate. 

To understand the role of the refinement layer in multiple description coding as well as scalable coding,
we need the following random variable substitution lemma. 

Lemma 1: Let $U$, $V$, and $W$ be jointly distributed random variables taking values in finite sets $\mathcal{U}$, $\mathcal{V}$, and $\mathcal{W}$,
respectively. There exist a random variable $Z$, taking values in a finite set $\mathcal{Z}$ with $|\mathcal{Z}| \le |\mathcal{V}||\mathcal{W}| - 1$, and a function
$f : \mathcal{V} \times \mathcal{Z} \to \mathcal{W}$ such that

1) $Z$ is independent of $V$;

2) $W = f(V, Z)$;

3) $U - (V, W) - Z$ form a Markov chain.

The proof of Lemma 1 is given in Appendix 1. Roughly speaking, this lemma states that one can remove random 
variable $W$ by introducing random variable $Z$ and deterministic function $f$. It will be seen in the context of multiple
description coding that Z can be incorporated into other random variables due to its special property, which results 
in a reduction of the number of random variables. 
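The substitution in Lemma 1 can be made concrete with a small sketch. The code below does not follow the support-lemma argument of Appendix I; it is an alternative inverse-CDF style construction, given here only as an illustration. The names `substitute` and `p_w_given_v` are hypothetical. The unit interval is cut at every breakpoint of the conditional CDFs of W given V = v, Z is the index of the resulting cell, and f(v, z) is the value of W whose CDF step contains that cell. This yields at most |V|(|W| - 1) + 1 values of Z, which does not exceed the bound of the lemma whenever |V| >= 2. The variable U is then attached through p(u | v, w), which gives the Markov chain.

```python
from fractions import Fraction
from itertools import accumulate


def substitute(p_w_given_v):
    """Build (p_z, f) with W = f(V, Z) and Z independent of V.

    p_w_given_v[v][w] is P(W = w | V = v), given as exact Fractions.
    Returns p_z (list of P(Z = z)) and f, where f[z][v] is the value of W.
    """
    # Conditional CDF of W for every value v of V.
    cdfs = [list(accumulate(row)) for row in p_w_given_v]
    # Merge all CDF breakpoints into one partition of (0, 1].
    cuts = sorted({c for cdf in cdfs for c in cdf if c > 0} | {Fraction(1)})
    p_z, f, left = [], [], Fraction(0)
    for right in cuts:
        p_z.append(right - left)  # probability of the cell (left, right]
        # The cell lies inside exactly one CDF step for each v.
        f.append([next(w for w, c in enumerate(cdf) if c >= right) for cdf in cdfs])
        left = right
    return p_z, f


# Toy conditional distribution (an assumption, not taken from the paper).
p = [[Fraction(1, 2), Fraction(1, 2)], [Fraction(1, 3), Fraction(2, 3)]]
p_z, f = substitute(p)
```

Because the law of Z is built without reference to the value of V, independence holds by construction, and summing `p_z` over the cells mapped to w recovers P(W = w | V = v).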

The remainder of this paper is devoted to the applications of the random variable substitution lemma to multiple 
description coding and scalable coding. In Section II, we show that the EGC region is equivalent to the EGC* 
region and the ZB region includes the EGC region. We examine the general L-description problem in Section III. 
It is shown that the final refinement layer in the VKG scheme can be removed. This result implies that the VKG 
region, when specialized to the 2-description case, is equivalent to the ZB region. Furthermore, we prove that for
multiple description coding with individual and hierarchical distortion constraints, the number of layers in the VKG
scheme can be significantly reduced when only certain weighted sum rates are concerned. We study scalable coding
with an emphasis on the role of the refinement layer in Section IV. Section V contains some concluding remarks.

¹The notion of successive refinement is sometimes used in the more restrictive no rate loss scenario.

II. Applications to the 2-description case 

We shall first give a formal definition of the multiple description rate-distortion region. Let $\{X(t)\}_{t=1}^{\infty}$ be an
i.i.d. process with marginal distribution $p_X$ on $\mathcal{X}$, and $d : \mathcal{X} \times \hat{\mathcal{X}} \to [0, \infty)$ be a distortion measure, where $\mathcal{X}$ and
$\hat{\mathcal{X}}$ are finite sets. Define $\mathcal{I}_L = \{1, \cdots, L\}$ for any positive integer $L$.

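Definition 1: A rate-distortion tuple $(R_1, \cdots, R_L, D_{\mathcal{K}}, \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L)$ is said to be achievable if for any $\epsilon > 0$,
there exist encoding functions $f_k^{(n)} : \mathcal{X}^n \to \mathcal{C}_k^{(n)}$, $k \in \mathcal{I}_L$, and decoding functions $g_{\mathcal{K}}^{(n)} : \prod_{k \in \mathcal{K}} \mathcal{C}_k^{(n)} \to \hat{\mathcal{X}}^n$,
$\emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L$, such that

$$\frac{1}{n} \log |\mathcal{C}_k^{(n)}| \le R_k + \epsilon, \quad k \in \mathcal{I}_L,$$

$$\frac{1}{n} \sum_{t=1}^{n} E[d(X(t), \hat{X}_{\mathcal{K}}(t))] \le D_{\mathcal{K}} + \epsilon, \quad \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L,$$

for all sufficiently large $n$, where $\hat{X}_{\mathcal{K}}^n = g_{\mathcal{K}}^{(n)}(f_k^{(n)}(X^n), k \in \mathcal{K})$. The multiple description rate-distortion region
$\mathcal{RD}_{\mathrm{MD}}$ is the set of all achievable rate-distortion tuples.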

We shall focus on the 2-description case (i.e., $L = 2$) in this section. The following two inner bounds of $\mathcal{RD}_{\mathrm{MD}}$
are attributed to El Gamal and Cover.

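The EGC* region $\mathcal{RD}_{\mathrm{EGC}^*}$ is the convex closure of the set of quintuples $(R_1, R_2, D_{\{1\}}, D_{\{2\}}, D_{\{1,2\}})$ for which
there exist auxiliary random variables $X_{\{1\}}$ and $X_{\{2\}}$, jointly distributed with $X$, and functions $\phi_{\mathcal{K}}$, $\emptyset \subset \mathcal{K} \subseteq \{1, 2\}$,
such that

$$R_k \ge I(X; X_{\{k\}}), \quad k \in \{1, 2\},$$

$$R_1 + R_2 \ge I(X; X_{\{1\}}, X_{\{2\}}) + I(X_{\{1\}}; X_{\{2\}}),$$

$$D_{\{k\}} \ge E[d(X, \phi_{\{k\}}(X_{\{k\}}))], \quad k \in \{1, 2\},$$

$$D_{\{1,2\}} \ge E[d(X, \phi_{\{1,2\}}(X_{\{1\}}, X_{\{2\}}))].$$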

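The EGC region $\mathcal{RD}_{\mathrm{EGC}}$ is the convex closure of the set of quintuples $(R_1, R_2, D_{\{1\}}, D_{\{2\}}, D_{\{1,2\}})$ for which
there exist auxiliary random variables $X_{\mathcal{K}}$, $\emptyset \subset \mathcal{K} \subseteq \{1, 2\}$, jointly distributed with $X$, such that

$$R_k \ge I(X; X_{\{k\}}), \quad k \in \{1, 2\}, \tag{1}$$

$$R_1 + R_2 \ge I(X; X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}}) + I(X_{\{1\}}; X_{\{2\}}), \tag{2}$$

$$D_{\mathcal{K}} \ge E[d(X, X_{\mathcal{K}})], \quad \emptyset \subset \mathcal{K} \subseteq \{1, 2\}. \tag{3}$$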

To see the connection between these two inner bounds, we shall write the EGC region in an alternative form. It can
be verified that the EGC region is equivalent to the set of quintuples $(R_1, R_2, D_{\{1\}}, D_{\{2\}}, D_{\{1,2\}})$ for which there
exist auxiliary random variables $X_{\mathcal{K}}$, $\emptyset \subset \mathcal{K} \subseteq \{1, 2\}$, jointly distributed with $X$, and functions $\phi_{\mathcal{K}}$, $\emptyset \subset \mathcal{K} \subseteq \{1, 2\}$,
such that

$$R_k \ge I(X; X_{\{k\}}), \quad k \in \{1, 2\},$$

$$R_1 + R_2 \ge I(X; X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}}) + I(X_{\{1\}}; X_{\{2\}}),$$

$$D_{\{k\}} \ge E[d(X, \phi_{\{k\}}(X_{\{k\}}))], \quad k \in \{1, 2\},$$

$$D_{\{1,2\}} \ge E[d(X, \phi_{\{1,2\}}(X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}}))].$$

It is easy to see from this alternative form of the EGC region that the only difference from the EGC* region is the
additional random variable $X_{\{1,2\}}$, which corresponds to a refinement layer; by setting $X_{\{1,2\}}$ to be constant (i.e.,
removing the refinement layer), we recover the EGC* region. Therefore, the EGC* region is contained in the EGC
region. It is natural to ask whether the refinement layer leads to a strict improvement. The answer turns out to be
negative as shown by the following theorem, which states that the two regions are in fact equivalent.

Theorem 1: $\mathcal{RD}_{\mathrm{EGC}^*} = \mathcal{RD}_{\mathrm{EGC}}$.

Proof: In view of the fact that $\mathcal{RD}_{\mathrm{EGC}^*} \subseteq \mathcal{RD}_{\mathrm{EGC}}$, it suffices to prove $\mathcal{RD}_{\mathrm{EGC}} \subseteq \mathcal{RD}_{\mathrm{EGC}^*}$.
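For any fixed $p_{X X_{\{1\}} X_{\{2\}} X_{\{1,2\}}}$, the region specified by (1)-(3) has two vertices

$$v_1 : (R_1(v_1), R_2(v_1), D_{\{1\}}(v_1), D_{\{2\}}(v_1), D_{\{1,2\}}(v_1)),$$

$$v_2 : (R_1(v_2), R_2(v_2), D_{\{1\}}(v_2), D_{\{2\}}(v_2), D_{\{1,2\}}(v_2)),$$

where

$$R_1(v_1) = I(X; X_{\{1\}}),$$

$$R_2(v_1) = I(X; X_{\{2\}}, X_{\{1,2\}} | X_{\{1\}}) + I(X_{\{1\}}; X_{\{2\}}),$$

$$R_1(v_2) = I(X; X_{\{1\}}, X_{\{1,2\}} | X_{\{2\}}) + I(X_{\{1\}}; X_{\{2\}}),$$

$$R_2(v_2) = I(X; X_{\{2\}}),$$

$$D_{\mathcal{K}}(v_1) = D_{\mathcal{K}}(v_2) = E[d(X, X_{\mathcal{K}})], \quad \emptyset \subset \mathcal{K} \subseteq \{1, 2\}.$$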

We just need to show that both vertices are contained in the EGC* region. By symmetry, we shall only consider
vertex $v_1$.

It follows from Lemma 1 that there exist a random variable $Z$, jointly distributed with $(X, X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}})$,
and a function $f$ such that

1) $Z$ is independent of $(X_{\{1\}}, X_{\{2\}})$;

2) $X_{\{1,2\}} = f(X_{\{1\}}, X_{\{2\}}, Z)$;

3) $X - (X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}}) - Z$ form a Markov chain.
By the fact that $X - (X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}}) - Z$ form a Markov chain and that $X_{\{1,2\}}$ is a deterministic function of
$(X_{\{1\}}, X_{\{2\}}, Z)$, we have

$$I(X; X_{\{2\}}, X_{\{1,2\}} | X_{\{1\}}) = I(X; X_{\{2\}}, X_{\{1,2\}}, Z | X_{\{1\}}) = I(X; X_{\{2\}}, Z | X_{\{1\}}).$$

Moreover, since $Z$ is independent of $(X_{\{1\}}, X_{\{2\}})$, it follows that

$$I(X_{\{1\}}; X_{\{2\}}) = I(X_{\{1\}}; X_{\{2\}}, Z).$$
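
The identity $I(X_{\{1\}}; X_{\{2\}}) = I(X_{\{1\}}; X_{\{2\}}, Z)$ can be checked with the chain rule. The following short sketch uses only the independence of $Z$ and $(X_{\{1\}}, X_{\{2\}})$:

```latex
\begin{align*}
I(X_{\{1\}}; X_{\{2\}}, Z) &= I(X_{\{1\}}; X_{\{2\}}) + I(X_{\{1\}}; Z \mid X_{\{2\}}), \\
0 \le I(X_{\{1\}}; Z \mid X_{\{2\}}) &\le I(X_{\{1\}}, X_{\{2\}}; Z) = 0.
\end{align*}
```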

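By setting $X'_{\{2\}} = (X_{\{2\}}, Z)$, we can rewrite the coordinates of $v_1$ as

$$R_1(v_1) = I(X; X_{\{1\}}),$$

$$R_2(v_1) = I(X; X'_{\{2\}} | X_{\{1\}}) + I(X_{\{1\}}; X'_{\{2\}}),$$

$$D_{\{1\}}(v_1) = E[d(X, \phi_{\{1\}}(X_{\{1\}}))],$$

$$D_{\{2\}}(v_1) = E[d(X, \phi_{\{2\}}(X'_{\{2\}}))],$$

$$D_{\{1,2\}}(v_1) = E[d(X, \phi_{\{1,2\}}(X_{\{1\}}, X'_{\{2\}}))],$$

where $\phi_{\{1\}}(X_{\{1\}}) = X_{\{1\}}$, $\phi_{\{2\}}(X'_{\{2\}}) = X_{\{2\}}$, and $\phi_{\{1,2\}}(X_{\{1\}}, X'_{\{2\}}) = f(X_{\{1\}}, X_{\{2\}}, Z) = X_{\{1,2\}}$. Therefore,
it is clear that vertex $v_1$ is contained in the EGC* region. The proof is complete. ■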

Remark: It is worth noting that the proof of Theorem 1 implicitly provides cardinality bounds for the auxiliary 
random variables of the EGC* region. 

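Now we shall proceed to discuss the ZB region, which is also an inner bound of $\mathcal{RD}_{\mathrm{MD}}$. The ZB region $\mathcal{RD}_{\mathrm{ZB}}$
is the set of quintuples $(R_1, R_2, D_{\{1\}}, D_{\{2\}}, D_{\{1,2\}})$ for which there exist auxiliary random variables $X_\emptyset$, $X_{\{1\}}$,
and $X_{\{2\}}$, jointly distributed with $X$, and functions $\phi_{\mathcal{K}}$, $\emptyset \subset \mathcal{K} \subseteq \{1, 2\}$, such that

$$R_k \ge I(X; X_\emptyset, X_{\{k\}}), \quad k \in \{1, 2\},$$

$$R_1 + R_2 \ge 2I(X; X_\emptyset) + I(X; X_{\{1\}}, X_{\{2\}} | X_\emptyset) + I(X_{\{1\}}; X_{\{2\}} | X_\emptyset),$$

$$D_{\{k\}} \ge E[d(X, \phi_{\{k\}}(X_\emptyset, X_{\{k\}}))], \quad k \in \{1, 2\},$$

$$D_{\{1,2\}} \ge E[d(X, \phi_{\{1,2\}}(X_\emptyset, X_{\{1\}}, X_{\{2\}}))].$$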

Note that the ZB region is a convex set. It is easy to see from the definition of the ZB region that its only difference
from the EGC* region is the additional random variable $X_\emptyset$, which corresponds to a common description layer; by
setting $X_\emptyset$ to be constant (i.e., removing the common description layer), we recover the EGC* region. Therefore,
the EGC* region is contained in the ZB region, and the following result is an immediate consequence of Theorem
1.

Corollary 1: $\mathcal{RD}_{\mathrm{EGC}} \subseteq \mathcal{RD}_{\mathrm{ZB}}$.

Remark: Since the ZB region contains rate-distortion tuples not in the EGC region as shown in [23], the inclusion 
can be strict. 
III. Applications to the L-description case

The general L-description problem turns out to be considerably more complex than the 2-description case. The
difficulty might be attributed to the following fact. For any two non-empty subsets of $\{1, 2\}$, either one contains
the other or they are disjoint; however, this is not true for subsets of $\mathcal{I}_L$ when $L > 2$. Indeed, this tree structure
of distortion constraints is a fundamental feature that distinguishes the 2-description problem from the general
L-description problem.

The VKG region [21], which is a natural combination and extension of the EGC region and the ZB region, is an 
inner bound of the L-description rate-distortion region. We shall show that the final refinement layer in the VKG 
scheme is dispensable, which implies that the VKG region, when specialized to the 2-description case, coincides 
with the ZB region. We formulate the problem of multiple description coding with individual and hierarchical 
distortion constraints, which is a special case of tree-structured distortion constraints, and show that in this setting 
the number of layers in the VKG scheme can be significantly reduced when only certain weighted sum rates are 
concerned. It is worth noting that the VKG scheme is not the only scheme known for the L-description problem. 
Indeed, there are several other schemes in the literature [14], [15], [18] which can outperform the VKG scheme in 
certain scenarios where the distortion constraints do not exhibit a tree structure. However, the VKG scheme remains
the most natural one for tree-structured distortion constraints.

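We shall adopt the notation in [21]. For any set $\mathcal{A}$, let $2^{\mathcal{A}}$ be the power set of $\mathcal{A}$. Given a collection of sets $\mathcal{B}$,
we define $X_{(\mathcal{B})} = \{X_{\mathcal{A}} : \mathcal{A} \in \mathcal{B}\}$. Note that $X_\emptyset$ (which is a random variable) should not be confused with $X_{(\emptyset)}$
(which is interpreted as a constant). We use $R_{\mathcal{K}}$ to denote $\sum_{k \in \mathcal{K}} R_k$ for $\emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L$.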

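The VKG region $\mathcal{RD}_{\mathrm{VKG}}$ is the set of rate-distortion tuples $(R_1, \cdots, R_L, D_{\mathcal{K}}, \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L)$ for which there
exist auxiliary random variables $X_{\mathcal{K}}$, $\mathcal{K} \subseteq \mathcal{I}_L$, jointly distributed with $X$, and functions $\phi_{\mathcal{K}}$, $\emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L$, such that

$$R_{\mathcal{K}} \ge \psi(\mathcal{K}), \quad \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L, \tag{4}$$

$$D_{\mathcal{K}} \ge E[d(X, \phi_{\mathcal{K}}(X_{(2^{\mathcal{K}})}))], \quad \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L, \tag{5}$$

where

$$\psi(\mathcal{K}) = (|\mathcal{K}| - 1) I(X; X_\emptyset) - H(X_{(2^{\mathcal{K}})} | X) + \sum_{\mathcal{A} \subseteq \mathcal{K}} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}).$$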

Note that the VKG region is a convex set. In fact, reference [21] contains a weak version and a strong version of
the VKG region, and the one given here is in a slightly different form from those in [21]. Specifically, one can get
the weak version in [21] by replacing (5) with $D_{\mathcal{K}} \ge E[d(X, X_{\mathcal{K}})]$, and get the strong version in [21] by replacing
(5) with $D_{\mathcal{K}} \ge E[d(X, \phi_{\mathcal{K}}(X_{\mathcal{K}}))]$. It is easy to verify that the strong version is equivalent to the one given here
while both of them are at least as large as the weak version; moreover, all three versions are equivalent when
$L = 2$.

We shall first give a structural characterization of the VKG region. 

Lemma 2: For any fixed $p_{X X_{(2^{\mathcal{I}_L})}}$, the rate region $\{(R_1, \cdots, R_L) : R_{\mathcal{K}} \ge \psi(\mathcal{K}), \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L\}$ is a
contra-polymatroid.

Proof: See Appendix II. ■

Note that the random variable $X_{\mathcal{I}_L}$ corresponds to the final refinement layer in the VKG scheme. Now we
proceed to show that this refinement layer can be removed. Define the VKG* region $\mathcal{RD}_{\mathrm{VKG}^*}$ as the VKG region
with $X_{\mathcal{I}_L}$ set to be a constant.

Theorem 2: $\mathcal{RD}_{\mathrm{VKG}^*} = \mathcal{RD}_{\mathrm{VKG}}$.

Proof: The proof is given in Appendix III. ■

A direct consequence of Theorem 2 is that the VKG region, when specialized to the 2-description case, is
equivalent to the ZB region.

Corollary 2: For the 2-description problem, $\mathcal{RD}_{\mathrm{ZB}} = \mathcal{RD}_{\mathrm{VKG}}$.

Remark: For the 2-description VKG region, the cardinality bound for $X_\emptyset$ can be derived by invoking the support
lemma [4] while all the other auxiliary random variables can be assumed, with no loss of generality, to be defined
on the reconstruction alphabet $\hat{\mathcal{X}}$. Therefore, one can deduce cardinality bounds for the auxiliary random variables
of the ZB region by leveraging Corollary 2.

We can see that for the VKG* region, the number of auxiliary random variables is exactly the same as the 
number of distortion constraints. Intuitively, the number of auxiliary random variables can be further reduced if 
we remove certain distortion constraints. Somewhat surprisingly, we shall show that in some cases the number of 
auxiliary random variables can be significantly less than the number of distortion constraints. 

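For any nonnegative integer $k$, define $\mathcal{H}_k = \emptyset$ if $k = 0$, $\mathcal{H}_k = \{\{1\}\}$ if $k = 1$, and $\mathcal{H}_k = \{\{1\}, \cdots, \{k\}, \mathcal{I}_2, \cdots, \mathcal{I}_k\}$
if $k \ge 2$. Multiple description coding with individual and hierarchical distortion constraints (see Fig. 1) refers to
the scenario where only the distortion constraints $D_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{H}_L$, are imposed. Specializing the VKG
region to this setting, we can define the VKG region for multiple description coding with individual and hierarchical
distortion constraints $\mathcal{RD}_{\mathrm{IH\text{-}VKG}}$ as the set of rate-distortion tuples $(R_1, \cdots, R_L, D_{\mathcal{K}}, \mathcal{K} \in \mathcal{H}_L)$ for which there
exist auxiliary random variables $X_{\mathcal{K}}$, $\mathcal{K} \subseteq \mathcal{I}_L$, jointly distributed with $X$, and functions $\phi_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{H}_L$, such that

$$R_{\mathcal{K}} \ge \psi(\mathcal{K}), \quad \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L,$$

$$D_{\mathcal{K}} \ge E[d(X, \phi_{\mathcal{K}}(X_{(2^{\mathcal{K}})}))], \quad \mathcal{K} \in \mathcal{H}_L.$$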

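Define $\mathcal{R}_{\mathrm{IH\text{-}VKG}}(D_{\mathcal{K}}, \mathcal{K} \in \mathcal{H}_L) = \{(R_1, \cdots, R_L) : (R_1, \cdots, R_L, D_{\mathcal{K}}, \mathcal{K} \in \mathcal{H}_L) \in \mathcal{RD}_{\mathrm{IH\text{-}VKG}}\}$. It is observed in [3]
that for the quadratic Gaussian case, the number of auxiliary random variables can be significantly reduced when
only certain supporting hyperplanes of $\mathcal{R}_{\mathrm{IH\text{-}VKG}}(D_{\mathcal{K}}, \mathcal{K} \in \mathcal{H}_L)$ are concerned. We shall show that this phenomenon
is not restricted to the quadratic Gaussian case.
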
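Theorem 3: For any $\alpha_1 \ge \cdots \ge \alpha_L \ge 0$, we have

$$\min_{(R_1, \cdots, R_L) \in \mathcal{R}_{\mathrm{IH\text{-}VKG}}(D_{\mathcal{K}}, \mathcal{K} \in \mathcal{H}_L)} \sum_{k=1}^{L} \alpha_k R_k = \min \sum_{k=1}^{L} \alpha_k \left[ I(X; X_\emptyset) + I(X, \{X_{\{i\}}\}_{i=1}^{k-1}; X_{\{k\}} | X_\emptyset) \right], \tag{6}$$

where the minimization in (6) is over $p_{X_\emptyset X_{\{1\}} \cdots X_{\{L\}} | X}$ and $\phi_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{H}_L$, subject to the constraints

$$D_{\{k\}} \ge E[d(X, \phi_{\{k\}}(X_\emptyset, X_{\{k\}}))], \quad k \in \mathcal{I}_L,$$

$$D_{\mathcal{I}_k} \ge E[d(X, \phi_{\mathcal{I}_k}(X_\emptyset, X_{\{1\}}, \cdots, X_{\{k\}}))], \quad k \in \mathcal{I}_L - \{1\}.$$

Proof: The proof of Theorem 3 is given in Appendix IV. ■
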
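Corollary 3: For any $\alpha_1 \ge \cdots \ge \alpha_L \ge 0$, we have

$$\min_{(R_1, \cdots, R_L) \in \mathcal{R}_{\mathrm{IH\text{-}VKG}}(D_{\mathcal{K}}, \mathcal{K} \in \mathcal{H}_L)} \sum_{k=1}^{L} \alpha_k R_k = \min \sum_{k=1}^{L} \alpha_k \left[ I(X; X_\emptyset) + I(X_{(\mathcal{H}_{k-1})}; X_{\{k\}} | X_\emptyset) + I(X; X_{\{k\}}, X_{\mathcal{I}_k} | X_\emptyset, X_{(\mathcal{H}_{k-1})}) \right], \tag{7}$$

where the minimization in (7) is over $p_{X_\emptyset X_{(\mathcal{H}_L)} | X}$ subject to the constraints

$$D_{\mathcal{K}} \ge E[d(X, X_{\mathcal{K}})], \quad \mathcal{K} \in \mathcal{H}_L.$$

Proof: See Appendix V. ■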

Remark: It should be noted that $X_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{H}_L$, in (7) are defined on the reconstruction alphabet $\hat{\mathcal{X}}$; moreover, for
$X_\emptyset$ in (7), the cardinality bound can be easily derived by invoking the support lemma [4]. In view of the proof of
Corollary 3, one can derive cardinality bounds for the auxiliary random variables in (6) by leveraging the cardinality
bounds for the auxiliary random variables in (7). This explains why "min" instead of "inf" is used in (6).

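A special case of multiple description coding with individual and hierarchical distortion constraints is called multiple
description coding with individual and central distortion constraints [3], [22], where only the individual distortion
constraints $D_{\{k\}}$, $k \in \mathcal{I}_L$, and the central distortion constraint $D_{\mathcal{I}_L}$ are imposed. Let $\mathcal{G}_L = \{\{1\}, \cdots, \{L\}, \mathcal{I}_L\}$.
We can define the VKG region for multiple description coding with individual and central distortion constraints
$\mathcal{RD}_{\mathrm{IC\text{-}VKG}}$ as the set of rate-distortion tuples $(R_1, \cdots, R_L, D_{\mathcal{K}}, \mathcal{K} \in \mathcal{G}_L)$ for which there exist auxiliary random
variables $X_{\mathcal{K}}$, $\mathcal{K} \subseteq \mathcal{I}_L$, jointly distributed with $X$, and functions $\phi_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{G}_L$, such that

$$R_{\mathcal{K}} \ge \psi(\mathcal{K}), \quad \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L,$$

$$D_{\mathcal{K}} \ge E[d(X, \phi_{\mathcal{K}}(X_{(2^{\mathcal{K}})}))], \quad \mathcal{K} \in \mathcal{G}_L.$$

Define $\mathcal{R}_{\mathrm{IC\text{-}VKG}}(D_{\mathcal{K}}, \mathcal{K} \in \mathcal{G}_L) = \{(R_1, \cdots, R_L) : (R_1, \cdots, R_L, D_{\mathcal{K}}, \mathcal{K} \in \mathcal{G}_L) \in \mathcal{RD}_{\mathrm{IC\text{-}VKG}}\}$. The following result
is a simple consequence of Theorem 3 and Corollary 3.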

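Corollary 4: $\mathcal{RD}_{\mathrm{IC\text{-}VKG}}$ is equivalent to the set of rate-distortion tuples $(R_1, \cdots, R_L, D_{\mathcal{K}}, \mathcal{K} \in \mathcal{G}_L)$ for which
there exist auxiliary random variables $X_\emptyset$, $X_{\{k\}}$, $k \in \mathcal{I}_L$, jointly distributed with $X$, and functions $\phi_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{G}_L$,
such that

$$R_{\mathcal{K}} \ge |\mathcal{K}| I(X; X_\emptyset) - H(\{X_{\{k\}}\}_{k \in \mathcal{K}} | X, X_\emptyset) + \sum_{k \in \mathcal{K}} H(X_{\{k\}} | X_\emptyset), \quad \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L,$$

$$D_{\{k\}} \ge E[d(X, \phi_{\{k\}}(X_\emptyset, X_{\{k\}}))], \quad k \in \mathcal{I}_L,$$

$$D_{\mathcal{I}_L} \ge E[d(X, \phi_{\mathcal{I}_L}(X_\emptyset, X_{\{1\}}, \cdots, X_{\{L\}}))].$$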

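$\mathcal{RD}_{\mathrm{IC\text{-}VKG}}$ is also equivalent to the set of rate-distortion tuples $(R_1, \cdots, R_L, D_{\mathcal{K}}, \mathcal{K} \in \mathcal{G}_L)$ for which there exist
auxiliary random variables $X_\emptyset$, $X_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{G}_L$, jointly distributed with $X$, and functions $\phi_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{G}_L$, such that

$$R_{\mathcal{K}} \ge |\mathcal{K}| I(X; X_\emptyset) - H(\{X_{\{k\}}\}_{k \in \mathcal{K}} | X, X_\emptyset) + \sum_{k \in \mathcal{K}} H(X_{\{k\}} | X_\emptyset), \quad \emptyset \subset \mathcal{K} \subset \mathcal{I}_L,$$

$$R_{\mathcal{I}_L} \ge L\, I(X; X_\emptyset) - H(\{X_{\{k\}}\}_{k \in \mathcal{I}_L} | X, X_\emptyset) + \sum_{k=1}^{L} H(X_{\{k\}} | X_\emptyset) + I(X; X_{\mathcal{I}_L} | X_\emptyset, \{X_{\{k\}}\}_{k \in \mathcal{I}_L}),$$

$$D_{\mathcal{K}} \ge E[d(X, X_{\mathcal{K}})], \quad \mathcal{K} \in \mathcal{G}_L.$$
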
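Moreover, for any $(\alpha_1, \cdots, \alpha_L) \in \mathbb{R}_+^L$, let $\pi$ be a permutation on $\mathcal{I}_L$ such that $\alpha_{\pi(1)} \ge \cdots \ge \alpha_{\pi(L)}$; we have

$$\min_{(R_1, \cdots, R_L) \in \mathcal{R}_{\mathrm{IC\text{-}VKG}}(D_{\mathcal{K}}, \mathcal{K} \in \mathcal{G}_L)} \sum_{k=1}^{L} \alpha_k R_k$$

$$= \min \sum_{k=1}^{L} \alpha_{\pi(k)} \left[ I(X; X_\emptyset) + I(X, \{X_{\{\pi(i)\}}\}_{i=1}^{k-1}; X_{\{\pi(k)\}} | X_\emptyset) \right] \tag{8}$$

$$= \min \sum_{k=1}^{L} \alpha_{\pi(k)} \left[ I(X; X_\emptyset) + I(X, \{X_{\{\pi(i)\}}\}_{i=1}^{k-1}; X_{\{\pi(k)\}} | X_\emptyset) \right] + \alpha_{\pi(L)} I(X; X_{\mathcal{I}_L} | X_\emptyset, \{X_{\{k\}}\}_{k \in \mathcal{I}_L}), \tag{9}$$

where the minimization in (8) is over $p_{X_\emptyset X_{\{1\}} \cdots X_{\{L\}} | X}$ and $\phi_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{G}_L$, subject to the constraints

$$D_{\{k\}} \ge E[d(X, \phi_{\{k\}}(X_\emptyset, X_{\{k\}}))], \quad k \in \mathcal{I}_L,$$

$$D_{\mathcal{I}_L} \ge E[d(X, \phi_{\mathcal{I}_L}(X_\emptyset, X_{\{1\}}, \cdots, X_{\{L\}}))],$$

while the minimization in (9) is over $p_{X_\emptyset X_{(\mathcal{G}_L)} | X}$ subject to the constraints

$$D_{\mathcal{K}} \ge E[d(X, X_{\mathcal{K}})], \quad \mathcal{K} \in \mathcal{G}_L.$$
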
IV. Applications to Scalable Coding

Scalable coding is a special case of the 2-description problem in which the distortion constraint on the second
description, i.e., $D_{\{2\}}$, is not imposed. The scalable coding rate-distortion region $\mathcal{RD}_{\mathrm{SC}}$ is defined as

$$\mathcal{RD}_{\mathrm{SC}} = \{(R_1, R_2, D_{\{1\}}, D_{\{1,2\}}) : (R_1, R_2, D_{\{1\}}, \infty, D_{\{1,2\}}) \in \mathcal{RD}_{\mathrm{MD}}\}.$$

It is proved in [16] that the quadruple $(R_1, R_2, D_{\{1\}}, D_{\{1,2\}}) \in \mathcal{RD}_{\mathrm{SC}}$ if and only if there exist auxiliary random
variables $X_{\{1\}}$ and $X_{\{1,2\}}$ jointly distributed with $X$ such that

$$R_1 \ge I(X; X_{\{1\}}),$$

$$R_1 + R_2 \ge I(X; X_{\{1\}}, X_{\{1,2\}}),$$

$$D_{\{1\}} \ge E[d(X, X_{\{1\}})],$$

$$D_{\{1,2\}} \ge E[d(X, X_{\{1,2\}})].$$

It is clear that one can obtain $\mathcal{RD}_{\mathrm{SC}}$ from $\mathcal{RD}_{\mathrm{EGC}}$ by setting $X_{\{2\}}$ to be a constant.

Since the EGC region is equivalent to the EGC* region, it is not surprising that $\mathcal{RD}_{\mathrm{SC}}$ can be written in an
alternative form which resembles the EGC* region. By Lemma 1, there exist a random variable $X_{\{2\}}$, jointly
distributed with $(X, X_{\{1\}}, X_{\{1,2\}})$, and a function $f$, such that

1) $X_{\{2\}}$ is independent of $X_{\{1\}}$;

2) $X_{\{1,2\}} = f(X_{\{1\}}, X_{\{2\}})$;

3) $X - (X_{\{1\}}, X_{\{1,2\}}) - X_{\{2\}}$ form a Markov chain.

Therefore, $\mathcal{RD}_{\mathrm{SC}}$ can be written as the set of quadruples $(R_1, R_2, D_{\{1\}}, D_{\{1,2\}})$ for which there exist independent
random variables $X_{\{1\}}$ and $X_{\{2\}}$, jointly distributed with $X$, and a function $f$, such that

$$R_1 \ge I(X; X_{\{1\}}),$$

$$R_1 + R_2 \ge I(X; X_{\{1\}}, X_{\{2\}}),$$

$$D_{\{1\}} \ge E[d(X, X_{\{1\}})],$$

$$D_{\{1,2\}} \ge E[d(X, f(X_{\{1\}}, X_{\{2\}}))].$$

It is somewhat interesting to note that a direct verification of the fact that this alternative form of $\mathcal{RD}_{\mathrm{SC}}$ is equivalent
to the EGC* region without constraint $D_{\{2\}}$ is not completely straightforward.

Since $D_{\{2\}}$ is not imposed in scalable coding, the second description essentially plays the role of a refinement
layer. It is natural to ask whether the refinement layer itself can be useful, i.e., whether one can use the refinement
layer alone to achieve a non-trivial reconstruction distortion. However, without further constraint, this problem is
essentially the same as the multiple description problem. Therefore, we shall focus on the following special case.
Define the minimum scalably achievable total rate $R(R_1, D_{\{1\}}, D_{\{1,2\}})$ with respect to $(R_1, D_{\{1\}}, D_{\{1,2\}})$ as

$$R(R_1, D_{\{1\}}, D_{\{1,2\}}) = \min\{R_1 + R_2 : (R_1, R_2, D_{\{1\}}, D_{\{1,2\}}) \in \mathcal{RD}_{\mathrm{SC}}\}.$$
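It is clear that [16]

$$R(R_1, D_{\{1\}}, D_{\{1,2\}}) = \min_{\substack{p_{X_{\{1\}} X_{\{1,2\}} | X} :\ I(X; X_{\{1\}}) \le R_1, \\ E[d(X, X_{\{1\}})] \le D_{\{1\}},\ E[d(X, X_{\{1,2\}})] \le D_{\{1,2\}}}} I(X; X_{\{1\}}, X_{\{1,2\}}).$$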

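Let $\mathcal{Q}$ denote the convex closure of the set of quintuples $(R_1, R_2, D_{\{1\}}, D_{\{2\}}, D_{\{1,2\}})$ for which there exist auxiliary
random variables $X_{\mathcal{K}}$, $\emptyset \subset \mathcal{K} \subseteq \{1, 2\}$, jointly distributed with $X$, such that

$$I(X_{\{1\}}; X_{\{2\}}) = 0,$$

$$R_k \ge I(X; X_{\{k\}}), \quad k \in \{1, 2\},$$

$$R_1 + R_2 \ge I(X; X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}}),$$

$$D_{\mathcal{K}} \ge E[d(X, X_{\mathcal{K}})], \quad \emptyset \subset \mathcal{K} \subseteq \{1, 2\}.$$

Note that $\mathcal{Q}$ is essentially the EGC region with an additional constraint $I(X_{\{1\}}; X_{\{2\}}) = 0$ (i.e., $X_{\{1\}}$ and $X_{\{2\}}$ are
independent).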

Lemma 3: The EGC region is tight if $R_1 + R_2 = R(R_1, D_{\{1\}}, D_{\{1,2\}})$; more precisely,

$$\{(R_1, R_2, D_{\{1\}}, D_{\{2\}}, D_{\{1,2\}}) \in \mathcal{RD}_{\mathrm{MD}} : R_1 + R_2 = R(R_1, D_{\{1\}}, D_{\{1,2\}})\}$$

$$= \{(R_1, R_2, D_{\{1\}}, D_{\{2\}}, D_{\{1,2\}}) \in \mathcal{Q} : R_1 + R_2 = R(R_1, D_{\{1\}}, D_{\{1,2\}})\}.$$

Proof: It is worth noting that this problem is not identical to multiple description coding without excess rate.
Nevertheless, Ahlswede's proof technique [1] (also cf. [20]) can be directly applied here with no essential change.
The details are omitted. ■
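Let $R(D)$ denote the rate-distortion function, i.e.,

$$R(D) = \min_{p_{\hat{X} | X} :\ E[d(X, \hat{X})] \le D} I(X; \hat{X}).$$

Now we proceed to study the minimum achievable $D_{\{2\}}$ in the scenario where $R_1 = R(D_{\{1\}})$ and $R_1 + R_2 =
R(R_1, D_{\{1\}}, D_{\{1,2\}})$. Define

$$D^*_{\{2\}}(D_{\{1\}}, D_{\{1,2\}}) = \min\{D_{\{2\}} : (R_1, R_2, D_{\{1\}}, D_{\{2\}}, D_{\{1,2\}}) \in \mathcal{RD}_{\mathrm{MD}},\ R_1 = R(D_{\{1\}}),\ R_1 + R_2 = R(R_1, D_{\{1\}}, D_{\{1,2\}})\}.$$

Though $D^*_{\{2\}}(D_{\{1\}}, D_{\{1,2\}})$ is in principle computable using Lemma 3, the calculation is often non-trivial due to
the convex hull operation in the definition of the EGC region. We shall show that $D^*_{\{2\}}(D_{\{1\}}, D_{\{1,2\}})$ has a more
explicit characterization under certain technical conditions.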

We need the following definition of weak independence from [2]. 

Definition 2: For jointly distributed random variables $U$ and $V$, $U$ is weakly independent of $V$ if the rows of
the stochastic matrix $[p_{U|V}(u|v)]$ are linearly dependent.
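
Weak independence is a rank condition and can be tested mechanically. The sketch below assumes that each row of the matrix is the conditional distribution of U given one value v of V; the helper names `rank` and `weakly_independent` are hypothetical. Exact rational arithmetic avoids spurious rank decisions caused by rounding.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix of Fractions via Gaussian elimination."""
    m = [list(r) for r in rows]
    r = 0
    for col in range(len(m[0])):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def weakly_independent(p_u_given_v):
    """True if the rows p_{U|V}(. | v) are linearly dependent."""
    return rank(p_u_given_v) < len(p_u_given_v)


# Binary symmetric relation with crossover 1/4: rows are linearly independent.
bsc = [[Fraction(3, 4), Fraction(1, 4)], [Fraction(1, 4), Fraction(3, 4)]]
# Three values of V but only two values of U: rows must be dependent.
tall = [[Fraction(1, 2), Fraction(1, 2)], [Fraction(1, 4), Fraction(3, 4)], [Fraction(3, 4), Fraction(1, 4)]]
```

With these toy matrices, `weakly_independent(bsc)` is false and `weakly_independent(tall)` is true; ordinary independence (all rows equal) is a special case of weak independence whenever V takes at least two values.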
The following lemma can be found in [2]. 

Lemma 4: For jointly distributed random variables $U$ and $V$, there exists a random variable $W$ satisfying

1) $U - V - W$ form a Markov chain;

2) $U$ and $W$ are independent;

3) $V$ and $W$ are not independent;

if and only if $U$ is weakly independent of $V$.

Theorem 4: If $X$ is not weakly independent of $X_{\{1\}}$ for any $X_{\{1\}}$ induced by $p_{X_{\{1\}} | X}$ that achieves $R(D_{\{1\}})$,
then

$$D^*_{\{2\}}(D_{\{1\}}, D_{\{1,2\}}) = \min E[d(X, g_1(X_{\{2\}}))], \tag{10}$$

where the minimization is over $p_{X_{\{1\}} X_{\{2\}} | X}$, $g_1$, and $g_2$ subject to the constraints

$$I(X_{\{1\}}; X_{\{2\}}) = 0,$$

$$I(X; X_{\{1\}}) = R(D_{\{1\}}),$$

$$I(X; X_{\{1\}}, X_{\{2\}}) = R(R(D_{\{1\}}), D_{\{1\}}, D_{\{1,2\}}),$$

$$E[d(X, X_{\{1\}})] \le D_{\{1\}},$$

$$E[d(X, g_2(X_{\{1\}}, X_{\{2\}}))] \le D_{\{1,2\}}.$$

Here one can assume that $X_{\{2\}}$ is defined on a finite set with cardinality no greater than $|\hat{\mathcal{X}}|^4 - |\hat{\mathcal{X}}|$.

Proof: First we shall show that the right-hand side of (10) is achievable. Given any $D_{\{1\}}$ and $D_{\{1,2\}}$ for
which there exist auxiliary random variables $X_{\mathcal{K}}$, $\emptyset \subset \mathcal{K} \subseteq \{1, 2\}$, jointly distributed with $X$, and a function $g_2$
such that

$$I(X_{\{1\}}; X_{\{2\}}) = 0,$$

$$R(D_{\{1\}}) = I(X; X_{\{1\}}),$$

$$R(R(D_{\{1\}}), D_{\{1\}}, D_{\{1,2\}}) = I(X; X_{\{1\}}, X_{\{2\}}),$$

$$D_{\{1\}} \ge E[d(X, X_{\{1\}})],$$

$$D_{\{1,2\}} \ge E[d(X, g_2(X_{\{1\}}, X_{\{2\}}))],$$

we have

$$R(R(D_{\{1\}}), D_{\{1\}}, D_{\{1,2\}}) = I(X; X_{\{1\}}, X_{\{2\}}) + I(X_{\{1\}}; X_{\{2\}}),$$

$$R(R(D_{\{1\}}), D_{\{1\}}, D_{\{1,2\}}) - R(D_{\{1\}}) = I(X_{\{1\}}, X; X_{\{2\}}) \ge I(X; X_{\{2\}}).$$

Therefore, the quintuple $(R_1, R_2, D_{\{1\}}, D_{\{2\}}, D_{\{1,2\}})$, where

$$R_1 = R(D_{\{1\}}),$$

$$R_2 = R(R(D_{\{1\}}), D_{\{1\}}, D_{\{1,2\}}) - R(D_{\{1\}}),$$

$$D_{\{2\}} = E[d(X, g_1(X_{\{2\}}))],$$

is contained in the EGC* region for any function $g_1$. This proves the achievability part.
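Now we proceed to prove the converse part. Let $R_1 = R(D_{\{1\}})$ and $R_2 = R(R(D_{\{1\}}), D_{\{1\}}, D_{\{1,2\}}) - R(D_{\{1\}})$.
Since the VKG region includes the EGC region, Lemma 3 implies that the VKG region is also tight when the total
rate is equal to $R(R(D_{\{1\}}), D_{\{1\}}, D_{\{1,2\}})$. Therefore, if the quintuple $(R_1, R_2, D_{\{1\}}, D_{\{2\}}, D_{\{1,2\}})$ is achievable,
then there exist auxiliary random variables $X_{\mathcal{K}}$, $\mathcal{K} \subseteq \{1, 2\}$, jointly distributed with $X$ such that

$$R_k \ge I(X; X_\emptyset, X_{\{k\}}), \quad k \in \{1, 2\},$$

$$R_1 + R_2 \ge 2I(X; X_\emptyset) + I(X; X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}} | X_\emptyset) + I(X_{\{1\}}; X_{\{2\}} | X_\emptyset),$$

$$D_{\mathcal{K}} \ge E[d(X, X_{\mathcal{K}})], \quad \emptyset \subset \mathcal{K} \subseteq \{1, 2\}.$$

By the definition of $R(D_{\{1\}})$ and $R(R(D_{\{1\}}), D_{\{1\}}, D_{\{1,2\}})$, we must have

$$R(D_{\{1\}}) = I(X; X_\emptyset, X_{\{1\}}) = I(X; X_{\{1\}}),$$

$$R(R(D_{\{1\}}), D_{\{1\}}, D_{\{1,2\}}) = 2I(X; X_\emptyset) + I(X; X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}} | X_\emptyset) + I(X_{\{1\}}; X_{\{2\}} | X_\emptyset) = I(X; X_{\{1\}}, X_{\{1,2\}}),$$

which implies that

1) $X$ and $X_\emptyset$ are independent;

2) $X - X_{\{1\}} - X_\emptyset$ form a Markov chain;

3) $X_{\{1\}} - X_\emptyset - X_{\{2\}}$ form a Markov chain;

4) $X - (X_{\{1\}}, X_{\{1,2\}}) - (X_\emptyset, X_{\{2\}})$ form a Markov chain;

5) $p_{X_{\{1\}} | X}$ achieves $R(D_{\{1\}})$.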

Since $X$ is not weakly independent of $X_{\{1\}}$, it follows from Lemma 4 that $X_\emptyset$ and $X_{\{1\}}$ are independent, which
further implies that $X_{\{1\}}$ and $X_{\{2\}}$ are independent. By Lemma 1, there exist a random variable $Z$ on $\mathcal{Z}$ with
$|\mathcal{Z}| \le |\hat{\mathcal{X}}|^3 - 1$ and a function $f$ such that

1) $Z$ is independent of $(X_{\{1\}}, X_{\{2\}})$;

2) $X_{\{1,2\}} = f(X_{\{1\}}, X_{\{2\}}, Z)$;

3) $X - (X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}}) - Z$ form a Markov chain.

By setting $X'_{\{2\}} = (X_{\{2\}}, Z)$, it is easy to verify that

$$I(X_{\{1\}}; X'_{\{2\}}) = 0,$$

$$R(R(D_{\{1\}}), D_{\{1\}}, D_{\{1,2\}}) = I(X; X_{\{1\}}, X_{\{1,2\}}) = I(X; X_{\{1\}}, X'_{\{2\}}),$$

$$D_{\{2\}} \ge E[d(X, g_1(X'_{\{2\}}))],$$

$$D_{\{1,2\}} \ge E[d(X, g_2(X_{\{1\}}, X'_{\{2\}}))],$$

where $g_1(X'_{\{2\}}) = g_1(X_{\{2\}}, Z) = X_{\{2\}}$ and $g_2(X_{\{1\}}, X'_{\{2\}}) = f(X_{\{1\}}, X_{\{2\}}, Z) = X_{\{1,2\}}$. The proof is complete.

■ 

Now we give an example for which $D^*_{\{2\}}(D_{\{1\}}, D_{\{1,2\}})$ can be calculated explicitly.

Theorem 5: For a binary symmetric source with Hamming distortion measure,

$$D^*_{\{2\}}(D_{\{1\}}, D_{\{1,2\}}) = \frac{1}{2} + D_{\{1,2\}} - D_{\{1\}}$$

for $0 \le D_{\{1,2\}} \le D_{\{1\}} \le \frac{1}{2}$.

Proof: The proof is given in Appendix VI. ■ 
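
As a quick numerical companion to Theorem 5, the sketch below evaluates the closed form, reading the expression as 1/2 + D_{1,2} - D_{1} on the range 0 <= D_{1,2} <= D_{1} <= 1/2 (this reading of the formula and of its range is an assumption). The function name `d2_star` is hypothetical.

```python
from fractions import Fraction


def d2_star(d1, d12):
    """Closed form of Theorem 5 for a binary symmetric source, Hamming distortion."""
    if not 0 <= d12 <= d1 <= Fraction(1, 2):
        raise ValueError("requires 0 <= D_{1,2} <= D_{1} <= 1/2")
    return Fraction(1, 2) + d12 - d1
```

Two boundary cases make the formula plausible. When D_{1,2} = D_{1} the refinement layer adds nothing to the first description, and the value is 1/2, the distortion of a blind guess. When D_{1} = 1/2 the first description is useless, the second description carries everything, and the value equals D_{1,2}.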

V. Concluding Remarks 

We have established a random variable substitution lemma and used it to clarify the relationship among several 
existing achievable rate-distortion regions for multiple description coding. 

Like many other ideas in information theory, our random variable substitution lemma finds its seeds in Shannon's
pioneering work. Consider a finite-state channel $p_{Y|XS}$ where the state process $\{S_t\}_{t=1}^{\infty}$ is stationary and
memoryless. It is well known that the capacity is given by

$$C = \max_{p_{X|S}} I(X; Y | S)$$

when the state process is available at both the transmitter and the receiver. By Lemma 1, for any $(X, Y, S)$, there
exist a random variable $Z$ on $\mathcal{Z}$ and a function $f : \mathcal{Z} \times \mathcal{S} \to \mathcal{X}$ such that

1) $Z$ is independent of $S$;

2) $X = f(Z, S)$;

3) $Y - (X, S) - Z$ form a Markov chain.

Therefore, we have

$$C = \max_{p_{X|S}} I(X; Y | S) = \max_{p_Z, f} I(Z; Y | S). \tag{11}$$

Note that (11) is in fact Shannon's capacity formula with channel state information at the transmitter [17] applied
to the case where the channel state information is also available at the receiver; in this setting, $f(Z, \cdot)$ is sometimes
referred to as Shannon's strategy. 
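
The equality of the two conditional mutual informations behind (11) follows from the three listed properties. A short sketch of the argument: the first line uses that $X$ is a function of $(S, Z)$, and the second uses the Markov chain $Y - (X, S) - Z$.

```latex
\begin{align*}
I(X; Y \mid S) &\le I(X, Z; Y \mid S) = I(Z; Y \mid S) + I(X; Y \mid S, Z) = I(Z; Y \mid S), \\
I(Z; Y \mid S) &\le I(X, Z; Y \mid S) = I(X; Y \mid S) + I(Z; Y \mid X, S) = I(X; Y \mid S).
\end{align*}
```

Conversely, any pair $(p_Z, f)$ with $Z$ independent of $S$ induces some $p_{X|S}$, so the two maximizations range over the same set of conditional distributions.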

Appendix I 
Proof of Lemma 1 

Let $Y$ be a random variable independent of $V$ and uniformly distributed over $[0, 1]$. It is obvious that for each
$v \in \mathcal{V}$ we can find a function $f_v$ satisfying

$$P(f_v(Y) = w) = p_{W|V}(w|v), \quad w \in \mathcal{W}.$$

Now define a function $f$ such that

$$f(v, y) = f_v(y), \quad v \in \mathcal{V},\ y \in [0, 1].$$

It is clear that

$$P(V = v, f(V, Y) = w) = p_{VW}(v, w), \quad v \in \mathcal{V},\ w \in \mathcal{W}. \tag{12}$$

Note that

$$P(V = v, f(V, Y) = w) = E[P(V = v, f(V, Y) = w | Y)], \quad v \in \mathcal{V},\ w \in \mathcal{W}.$$

It can be shown by invoking the support lemma [4] that there exist a finite set $\mathcal{Z} \subset [0, 1]$ with $|\mathcal{Z}| \le |\mathcal{V}||\mathcal{W}| - 1$
and a random variable $Z$ on $\mathcal{Z}$, independent of $V$, such that

$$\begin{aligned}
P(V = v, f(V, Z) = w) &= E[P(V = v, f(V, Z) = w | Z)] \\
&= E[P(V = v, f(V, Y) = w | Y)] \\
&= P(V = v, f(V, Y) = w), \quad v \in \mathcal{V},\ w \in \mathcal{W}.
\end{aligned} \tag{13}$$

By (12) and (13), we can see that $p_{VW}$ is preserved if $W$ is set to be equal to $f(V, Z)$. Now we incorporate $U$
into the probability space by setting $p_{U|VWZ} = p_{U|VW}$. It can be readily verified that $p_{UVW}$ is preserved and
$U - (V, W) - Z$ indeed form a Markov chain. The proof is complete.

Appendix II 
Proof of Lemma 2 

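By the definition of contra-polymatroid [5], it suffices to show that the set function $\psi : 2^{\mathcal{I}_L} \to \mathbb{R}_+$ satisfies 1)
$\psi(\emptyset) = 0$ (normalized), 2) $\psi(\mathcal{S}) \le \psi(\mathcal{T})$ if $\mathcal{S} \subseteq \mathcal{T}$ (nondecreasing), 3) $\psi(\mathcal{S}) + \psi(\mathcal{T}) \le \psi(\mathcal{S} \cup \mathcal{T}) + \psi(\mathcal{S} \cap \mathcal{T})$
(supermodular).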
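
For small ground sets the three conditions can be checked by brute force. The sketch below is illustrative only: the set function passed in is an arbitrary toy example, not the function of (4), and the names `subsets` and `is_contra_polymatroid_rank` are hypothetical.

```python
from itertools import combinations


def subsets(ground):
    """All subsets of `ground` as frozensets."""
    items = list(ground)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def is_contra_polymatroid_rank(psi, ground):
    """Check that psi is normalized, nondecreasing, and supermodular."""
    subs = subsets(ground)
    normalized = psi(frozenset()) == 0
    # s <= t is the subset test for frozensets.
    nondecreasing = all(psi(s) <= psi(t) for s in subs for t in subs if s <= t)
    supermodular = all(
        psi(s) + psi(t) <= psi(s | t) + psi(s & t) for s in subs for t in subs
    )
    return normalized and nondecreasing and supermodular
```

For instance, the toy function that maps a set to the square of its cardinality passes all three checks, whereas `min(len(s), 1)` fails supermodularity on two disjoint singletons.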

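1) Normalized: We have

$$\psi(\emptyset) = -I(X; X_\emptyset) - H(X_\emptyset | X) + H(X_\emptyset) = 0.$$

2) Nondecreasing: If $\mathcal{S} \subseteq \mathcal{T}$, then

$$\begin{aligned}
\psi(\mathcal{T}) - \psi(\mathcal{S}) &= (|\mathcal{T}| - |\mathcal{S}|) I(X; X_\emptyset) - H(X_{(2^{\mathcal{T}})} | X) + H(X_{(2^{\mathcal{S}})} | X) + \sum_{\mathcal{A} \in 2^{\mathcal{T}} - 2^{\mathcal{S}}} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&\ge -H(X_{(2^{\mathcal{T}} - 2^{\mathcal{S}})} | X, X_{(2^{\mathcal{S}})}) + \sum_{\mathcal{A} \in 2^{\mathcal{T}} - 2^{\mathcal{S}}} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&\ge -\sum_{k=1}^{|\mathcal{T}|} \sum_{\mathcal{A} \in 2^{\mathcal{T}} - 2^{\mathcal{S}}, |\mathcal{A}| = k} H(X_{\mathcal{A}} | X, \{X_{\mathcal{B}}\}_{\mathcal{B} \in 2^{\mathcal{T}}, |\mathcal{B}| < k}) + \sum_{\mathcal{A} \in 2^{\mathcal{T}} - 2^{\mathcal{S}}} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&\ge -\sum_{k=1}^{|\mathcal{T}|} \sum_{\mathcal{A} \in 2^{\mathcal{T}} - 2^{\mathcal{S}}, |\mathcal{A}| = k} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) + \sum_{\mathcal{A} \in 2^{\mathcal{T}} - 2^{\mathcal{S}}} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&= 0.
\end{aligned}$$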
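3) Supermodular: We have

$$\begin{aligned}
&\psi(\mathcal{S} \cup \mathcal{T}) - \psi(\mathcal{T}) - (\psi(\mathcal{S}) - \psi(\mathcal{S} \cap \mathcal{T})) \\
&= (|\mathcal{S} \cup \mathcal{T}| - |\mathcal{T}|) I(X; X_\emptyset) - H(X_{(2^{\mathcal{S} \cup \mathcal{T}} - 2^{\mathcal{T}})} | X, X_{(2^{\mathcal{T}})}) + \sum_{\mathcal{A} \in 2^{\mathcal{S} \cup \mathcal{T}} - 2^{\mathcal{T}}} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&\quad - (|\mathcal{S}| - |\mathcal{S} \cap \mathcal{T}|) I(X; X_\emptyset) + H(X_{(2^{\mathcal{S}} - 2^{\mathcal{S} \cap \mathcal{T}})} | X, X_{(2^{\mathcal{S} \cap \mathcal{T}})}) - \sum_{\mathcal{A} \in 2^{\mathcal{S}} - 2^{\mathcal{S} \cap \mathcal{T}}} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&= -H(X_{(2^{\mathcal{S} \cup \mathcal{T}} - 2^{\mathcal{T}})} | X, X_{(2^{\mathcal{T}})}) + H(X_{(2^{\mathcal{S}} - 2^{\mathcal{S} \cap \mathcal{T}})} | X, X_{(2^{\mathcal{S} \cap \mathcal{T}})}) + \sum_{\mathcal{A} \in 2^{\mathcal{S} \cup \mathcal{T}} - \mathcal{M}} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&\ge -H(X_{(2^{\mathcal{S} \cup \mathcal{T}} - \mathcal{M})} | X, X_{(\mathcal{M})}) + \sum_{\mathcal{A} \in 2^{\mathcal{S} \cup \mathcal{T}} - \mathcal{M}} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&\ge -\sum_{k=1}^{|\mathcal{S} \cup \mathcal{T}|} \sum_{\mathcal{A} \in 2^{\mathcal{S} \cup \mathcal{T}} - \mathcal{M}, |\mathcal{A}| = k} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) + \sum_{\mathcal{A} \in 2^{\mathcal{S} \cup \mathcal{T}} - \mathcal{M}} H(X_{\mathcal{A}} | X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&= 0,
\end{aligned}$$

where $\mathcal{M} = 2^{\mathcal{S}} \cup 2^{\mathcal{T}}$. The proof is complete.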

Appendix III 
Proof of Theorem 2 

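It is clear that $\mathcal{RD}_{\text{VKG*}} \subseteq \mathcal{RD}_{\text{VKG}}$. Therefore, we just need to show that $\mathcal{RD}_{\text{VKG}} \subseteq \mathcal{RD}_{\text{VKG*}}$.

In view of Lemma 2 and the property of contra-polymatroid [5], for fixed $p_{X X_{(2^{\mathcal{I}_L})}}$ and $\phi_{\mathcal{K}}$, $\emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L$, the region specified by (4) and (5) has $L!$ vertices: $(R_1(\pi), \ldots, R_L(\pi), D_{\mathcal{K}}(\pi), \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L)$ is a vertex for each permutation $\pi$ on $\mathcal{I}_L$, where

$$\begin{aligned}
R_{\pi(1)}(\pi) &= \psi(\{\pi(1)\}), \\
R_{\pi(k)}(\pi) &= \psi(\{\pi(1), \ldots, \pi(k)\}) - \psi(\{\pi(1), \ldots, \pi(k-1)\}), \quad k \in \mathcal{I}_L - \{1\}, \\
D_{\mathcal{K}}(\pi) &= \mathbb{E}[d(X, \phi_{\mathcal{K}}(X_{(2^{\mathcal{K}})}))], \quad \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L.
\end{aligned}$$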
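The rate coordinates of these vertices can be enumerated mechanically for any normalized set function. The sketch below uses a toy set function on $\{1, 2\}$ whose values are hypothetical and chosen only to be nondecreasing and supermodular; it is not derived from any source distribution. The function name `rate_vertices` is likewise hypothetical.

```python
from itertools import permutations

def rate_vertices(psi, ground):
    """For each permutation pi of the ground set, assign
    R_{pi(k)} = psi({pi(1..k)}) - psi({pi(1..k-1)}), with psi(emptyset) = 0,
    and return the set of resulting rate tuples (ordered by index)."""
    verts = set()
    for perm in permutations(ground):
        rates = {}
        prefix = frozenset()
        for k in perm:
            nxt = prefix | {k}
            rates[k] = psi[nxt] - psi[prefix]
            prefix = nxt
        verts.add(tuple(rates[k] for k in sorted(ground)))
    return verts

# Toy set function (hypothetical values): normalized, nondecreasing, supermodular.
toy_psi = {
    frozenset(): 0,
    frozenset({1}): 1,
    frozenset({2}): 2,
    frozenset({1, 2}): 4,
}
print(sorted(rate_vertices(toy_psi, [1, 2])))
```

Each tuple returned meets the constraint for the full ground set with equality, which is what distinguishes a vertex from an interior point of the region.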

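Since the VKG* region is a convex set, it suffices to show that these $L!$ vertices are contained in the VKG* region. Without loss of generality, we shall assume that $\pi(k) = k$, $k \in \mathcal{I}_L$. In this case, we have

$$R_L(\pi) = \psi(\mathcal{I}_L) - \psi(\mathcal{I}_{L-1}).$$

Now we proceed to write $R_L(\pi)$ as a sum of certain mutual information quantities. Define

$$\begin{aligned}
\mathcal{S}_1(k) &= \{\mathcal{A} : \mathcal{A} \in 2^{\mathcal{I}_L},\ |\mathcal{A}| = k,\ L \in \mathcal{A}\}, \\
\mathcal{S}_2(k) &= \{\mathcal{A} : \mathcal{A} \in 2^{\mathcal{I}_L},\ |\mathcal{A}| < k,\ L \in \mathcal{A}\}.
\end{aligned}$$

Note that

$$\begin{aligned}
R_L(\pi) &= I(X;X_\emptyset) + H(X_{(2^{\mathcal{I}_{L-1}})}|X) - H(X_{(2^{\mathcal{I}_L})}|X) + \sum_{k=1}^{L} \sum_{\mathcal{A} \in \mathcal{S}_1(k)} H(X_{\mathcal{A}}|X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&= I(X;X_\emptyset) - H(X_{(2^{\mathcal{I}_L})}|X, X_{(2^{\mathcal{I}_{L-1}})}) + H(X_{\{L\}}|X_\emptyset) + \sum_{k=2}^{L} \sum_{\mathcal{A} \in \mathcal{S}_1(k)} H(X_{\mathcal{A}}|X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&= I(X;X_\emptyset) + I(X;X_{\{L\}}|X_{(2^{\mathcal{I}_{L-1}})}) + I(X_{(2^{\mathcal{I}_{L-1}})};X_{\{L\}}|X_\emptyset) - H(X_{(2^{\mathcal{I}_L})}|X, X_{(2^{\mathcal{I}_{L-1}})}, X_{\{L\}}) \\
&\quad + \sum_{k=2}^{L} \sum_{\mathcal{A} \in \mathcal{S}_1(k)} H(X_{\mathcal{A}}|X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) \\
&= I(X;X_\emptyset) + I(X;X_{\{L\}}|X_{(2^{\mathcal{I}_{L-1}})}) + I(X_{(2^{\mathcal{I}_{L-1}})};X_{\{L\}}|X_\emptyset) \\
&\quad + \sum_{k=2}^{L} \Big[ \sum_{\mathcal{A} \in \mathcal{S}_1(k)} H(X_{\mathcal{A}}|X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) - H(X_{(\mathcal{S}_1(k))}|X, X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}) \Big].
\end{aligned}$$

We arrange the sets in $\mathcal{S}_1(k)$ in some arbitrary order and denote them by $\mathcal{S}_{k,1}, \ldots, \mathcal{S}_{k,N(k)}$, respectively, where $N(k) = \binom{L}{k} - \binom{L-1}{k}$. Then for each $k$,

$$\begin{aligned}
&\sum_{\mathcal{A} \in \mathcal{S}_1(k)} H(X_{\mathcal{A}}|X_{(2^{\mathcal{A}} - \{\mathcal{A}\})}) - H(X_{(\mathcal{S}_1(k))}|X, X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}) \\
&\quad = \sum_{i=1}^{N(k)} \Big[ H(X_{\mathcal{S}_{k,i}}|X_{(2^{\mathcal{S}_{k,i}} - \{\mathcal{S}_{k,i}\})}) - H(X_{\mathcal{S}_{k,i}}|X, X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}, X_{(\{\mathcal{S}_{k,j}\}_{j=1}^{i-1})}) \Big] \\
&\quad = \sum_{i=1}^{N(k)} I(X, X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}, X_{(\{\mathcal{S}_{k,j}\}_{j=1}^{i-1})}; X_{\mathcal{S}_{k,i}}|X_{(2^{\mathcal{S}_{k,i}} - \{\mathcal{S}_{k,i}\})}) \\
&\quad = \sum_{i=1}^{N(k)} I(X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}, X_{(\{\mathcal{S}_{k,j}\}_{j=1}^{i-1})}; X_{\mathcal{S}_{k,i}}|X_{(2^{\mathcal{S}_{k,i}} - \{\mathcal{S}_{k,i}\})}) \\
&\qquad + \sum_{i=1}^{N(k)} I(X; X_{\mathcal{S}_{k,i}}|X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}, X_{(\{\mathcal{S}_{k,j}\}_{j=1}^{i-1})}) \\
&\quad = \sum_{i=1}^{N(k)} I(X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}, X_{(\{\mathcal{S}_{k,j}\}_{j=1}^{i-1})}; X_{\mathcal{S}_{k,i}}|X_{(2^{\mathcal{S}_{k,i}} - \{\mathcal{S}_{k,i}\})}) + I(X; X_{(\mathcal{S}_1(k))}|X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}).
\end{aligned}$$

Therefore, we have

$$\begin{aligned}
R_L(\pi) &= I(X;X_\emptyset) + I(X;X_{\{L\}}|X_{(2^{\mathcal{I}_{L-1}})}) + I(X_{(2^{\mathcal{I}_{L-1}})};X_{\{L\}}|X_\emptyset) \\
&\quad + \sum_{k=2}^{L} \Big[ \sum_{i=1}^{N(k)} I(X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}, X_{(\{\mathcal{S}_{k,j}\}_{j=1}^{i-1})}; X_{\mathcal{S}_{k,i}}|X_{(2^{\mathcal{S}_{k,i}} - \{\mathcal{S}_{k,i}\})}) + I(X; X_{(\mathcal{S}_1(k))}|X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}) \Big] \\
&= I(X;X_\emptyset) + I(X; X_{(\mathcal{S}_2(L))}, X_{\mathcal{I}_L}|X_{(2^{\mathcal{I}_{L-1}})}) + I(X_{(2^{\mathcal{I}_{L-1}})};X_{\{L\}}|X_\emptyset) \\
&\quad + \sum_{k=2}^{L} \sum_{i=1}^{N(k)} I(X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}, X_{(\{\mathcal{S}_{k,j}\}_{j=1}^{i-1})}; X_{\mathcal{S}_{k,i}}|X_{(2^{\mathcal{S}_{k,i}} - \{\mathcal{S}_{k,i}\})}).
\end{aligned} \tag{14}$$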
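The families $\mathcal{S}_1(k)$ and $\mathcal{S}_2(k)$ used in the derivation above are easy to enumerate for small $L$. The sketch below checks two combinatorial facts the derivation relies on: the count $N(k) = \binom{L}{k} - \binom{L-1}{k}$, and the fact that every proper subset of a member of $\mathcal{S}_1(k)$ lies either in $2^{\mathcal{I}_{L-1}}$ or in $\mathcal{S}_2(k)$. The function names are hypothetical.

```python
from itertools import combinations
from math import comb

def S1(L, k):
    """Subsets of {1, ..., L} of size exactly k that contain L."""
    return [frozenset(c) for c in combinations(range(1, L + 1), k) if L in c]

def S2(L, k):
    """Subsets of {1, ..., L} of size less than k that contain L."""
    return [A for j in range(1, k) for A in S1(L, j)]

def proper_subsets(A):
    """All proper subsets of A, including the empty set."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items))
            for c in combinations(items, r)]

def check(L):
    for k in range(1, L + 1):
        # N(k) = C(L, k) - C(L-1, k)
        if len(S1(L, k)) != comb(L, k) - comb(L - 1, k):
            return False
        lower = set(S2(L, k))
        for A in S1(L, k):
            for B in proper_subsets(A):
                # B either avoids L (so it is a subset of I_{L-1}) or is in S2(k).
                if L in B and B not in lower:
                    return False
    return True

print(all(check(L) for L in range(1, 7)))
```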



It follows from Lemma 1 that there exist an auxiliary random variable $Z$ and a function $f$ such that

1) $Z$ is independent of $(X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(L))})$;

2) $X_{\mathcal{I}_L} = f(X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(L))}, Z)$;

3) $X - (X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(L))}, X_{\mathcal{I}_L}) - Z$ form a Markov chain.

Fig. 2. The structure of auxiliary random variables for the VKG region.
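Therefore, we have

$$\begin{aligned}
I(X; X_{(\mathcal{S}_2(L))}, X_{\mathcal{I}_L}|X_{(2^{\mathcal{I}_{L-1}})}) &= I(X; X_{(\mathcal{S}_2(L))}, Z|X_{(2^{\mathcal{I}_{L-1}})}), \\
I(X_{(2^{\mathcal{I}_{L-1}})}; X_{\{L\}}|X_\emptyset) &= I(X_{(2^{\mathcal{I}_{L-1}})}; X_{\{L\}}, Z|X_\emptyset),
\end{aligned}$$

and

$$\begin{aligned}
&I(X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}, X_{(\{\mathcal{S}_{k,j}\}_{j=1}^{i-1})}; X_{\mathcal{S}_{k,i}}|X_{(2^{\mathcal{S}_{k,i}} - \{\mathcal{S}_{k,i}\})}) \\
&\quad = I(X_{(2^{\mathcal{I}_{L-1}})}, X_{(\mathcal{S}_2(k))}, X_{(\{\mathcal{S}_{k,j}\}_{j=1}^{i-1})}, Z; X_{\mathcal{S}_{k,i}}|X_{(2^{\mathcal{S}_{k,i}} - \{\mathcal{S}_{k,i}\})})
\end{aligned}$$

for $1 \le i \le N(k)$ and $2 \le k \le L - 1$. Now it can be easily verified that $(R_1(\pi), \ldots, R_L(\pi), D_{\mathcal{K}}(\pi), \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L)$ is preserved if we substitute $X_{\{L\}}$ with $(X_{\{L\}}, Z)$, set $X_{\mathcal{I}_L}$ to be a constant, and modify $\phi_{\mathcal{K}}$, $\emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L$, accordingly. By the definition of the VKG* region, it is clear that $(R_1(\pi), \ldots, R_L(\pi), D_{\mathcal{K}}(\pi), \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L) \in \mathcal{RD}_{\text{VKG*}}$. The proof is complete.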

Appendix IV 
Proof of Theorem 3 

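Let $R_1^* = \psi(\{1\})$, and $R_k^* = \psi(\mathcal{I}_k) - \psi(\mathcal{I}_{k-1})$, $k \in \mathcal{I}_L - \{1\}$. By Lemma 2 and the property of contra-polymatroid [5], $(R_1^*, \ldots, R_L^*)$ is a vertex of the rate region $\{(R_1, \ldots, R_L) : R_{\mathcal{K}} \ge \psi(\mathcal{K}), \emptyset \subset \mathcal{K} \subseteq \mathcal{I}_L\}$; moreover, we have

$$\min_{(R_1, \ldots, R_L) \in \mathcal{R}_{\text{VKG}}(D_{\mathcal{K}},\, \mathcal{K} \in \mathcal{H}_L)} \sum_{k=1}^{L} \alpha_k R_k = \min \sum_{k=1}^{L} \alpha_k R_k^*, \tag{15}$$

where the minimization in (15) is over the conditional distribution of the auxiliary random variables given $X$ and the functions $\phi_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{H}_L$, subject to the constraints

$$D_{\mathcal{K}} \ge \mathbb{E}[d(X, \phi_{\mathcal{K}}(X_{(2^{\mathcal{K}})}))], \quad \mathcal{K} \in \mathcal{H}_L.$$

It follows from Theorem 2 that $X_{\mathcal{I}_L}$ can be eliminated. Inspecting (14) reveals that the same method can be used to eliminate $X_{\mathcal{K}}$, $\mathcal{K} \in \mathcal{S}_2(L) - \{\{L\}\}$, successively in the reverse order (i.e., the bottom-to-top and right-to-left order in Fig. 2). For $k$ from $L - 1$ to $2$, we write $R_k^*$ in a form analogous to (14) and execute this elimination procedure. In this way all the auxiliary random variables, except $X_\emptyset, X_{\{1\}}, \ldots, X_{\{L\}}$, are eliminated. It can be verified that the resulting expression for $(R_1^*, \ldots, R_L^*)$ is

$$R_k^* = I(X;X_\emptyset) + I(X, \{X_{\{i\}}\}_{i=1}^{k-1}; X_{\{k\}}|X_\emptyset), \quad k \in \mathcal{I}_L.$$

The proof is complete.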

Appendix V 
Proof of Corollary 3 

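First we shall show that (6) is greater than or equal to (7). Let $X'_{\{k\}} = \phi_{\{k\}}(X_\emptyset, X_{\{k\}})$, $k \in \mathcal{I}_L$, and $X'_{\mathcal{I}_k} = \phi_{\mathcal{I}_k}(X_\emptyset, X_{\{1\}}, \ldots, X_{\{k\}})$, $k \in \mathcal{I}_L - \{1\}$. It can be verified that

$$\begin{aligned}
&\sum_{k=1}^{L} \alpha_k [I(X;X_\emptyset) + I(X, \{X_{\{i\}}\}_{i=1}^{k-1}; X_{\{k\}}|X_\emptyset)] \\
&\quad = \sum_{k=1}^{L} \alpha_k [I(X;X_\emptyset) + I(\{X_{\{i\}}\}_{i=1}^{k-1}; X_{\{k\}}|X_\emptyset) + I(X; X_{\{k\}}|X_\emptyset, \{X_{\{i\}}\}_{i=1}^{k-1})] \\
&\quad = \sum_{k=2}^{L} \alpha_k [I(X;X_\emptyset) + I(\{X_{\{i\}}\}_{i=1}^{k-1}; X_{\{k\}}|X_\emptyset)] + \sum_{k=1}^{L} (\alpha_k - \alpha_{k+1}) I(X; X_\emptyset, \{X_{\{i\}}\}_{i=1}^{k}) \\
&\quad \ge \sum_{k=2}^{L} \alpha_k [I(X;X_\emptyset) + I(X'_{(\mathcal{H}_{k-1})}; X'_{\{k\}}|X_\emptyset)] + \sum_{k=1}^{L} (\alpha_k - \alpha_{k+1}) I(X; X_\emptyset, X'_{(\mathcal{H}_k)}) \\
&\quad = \sum_{k=1}^{L} \alpha_k [I(X;X_\emptyset) + I(X'_{(\mathcal{H}_{k-1})}; X'_{\{k\}}|X_\emptyset) - I(X; X_\emptyset, X'_{(\mathcal{H}_{k-1})}) + I(X; X_\emptyset, X'_{(\mathcal{H}_k)})] \\
&\quad = \sum_{k=1}^{L} \alpha_k [I(X;X_\emptyset) + I(X'_{(\mathcal{H}_{k-1})}; X'_{\{k\}}|X_\emptyset) + I(X; X'_{\{k\}}, X'_{\mathcal{I}_k}|X_\emptyset, X'_{(\mathcal{H}_{k-1})})],
\end{aligned}$$

where $\alpha_{L+1} = 0$.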

Now we proceed to show that (7) is greater than or equal to (6). It follows from Lemma 1 that there exist a random variable $Z$ and a function $f$ such that

1) $Z$ is independent of $(X_\emptyset, X_{(\mathcal{H}_{L-1})}, X_{\{L\}})$;

2) $X_{\mathcal{I}_L} = f(X_\emptyset, X_{(\mathcal{H}_{L-1})}, X_{\{L\}}, Z)$;

3) $X - (X_\emptyset, X_{(\mathcal{H}_L)}) - Z$ form a Markov chain.

Note that

$$I(X; X_{\{L\}}, X_{\mathcal{I}_L}|X_\emptyset, X_{(\mathcal{H}_{L-1})}) = I(X; X_{\{L\}}, Z|X_\emptyset, X_{(\mathcal{H}_{L-1})}).$$

Therefore, we can substitute $X_{\{L\}}$ with $(X_{\{L\}}, Z)$ and eliminate $X_{\mathcal{I}_L}$. It is clear that one can successively eliminate $X_{\mathcal{I}_{L-1}}, \ldots, X_{\mathcal{I}_2}$ in a similar manner. The proof is complete.

Appendix VI 
Proof of Theorem 5 

It is obvious that $D^*_{\{2\}}(D_{\{1\}}, D_{\{1,2\}}) = D_{\{1,2\}}$ if $D_{\{1\}} = \frac{1}{2}$. Therefore, we shall only consider the case $D_{\{1\}} < \frac{1}{2}$.

Since binary symmetric sources are successively refinable, it follows that

$$R(D_{\{1\}}) = 1 - H_b(D_{\{1\}}),$$

and that, when the first-layer rate equals $R(D_{\{1\}})$, the minimum total rate needed to achieve the distortion pair $(D_{\{1\}}, D_{\{1,2\}})$ is $1 - H_b(D_{\{1,2\}})$, where $H_b(\cdot)$ is the binary entropy function. If $D_{\{1\}} < \frac{1}{2}$, then $R(D_{\{1\}})$ is achieved if and only if $p_{X_{\{1\}}|X}$ is a binary symmetric channel with crossover probability $D_{\{1\}}$; it is clear that $X$ is not weakly independent with the resulting $X_{\{1\}}$. Therefore, Theorem 4 is applicable here.

Define $X_{\{1,2\}} = g_2(X_{\{1\}}, X_{\{2\}})$. Note that we must have $\mathbb{E}[d(X, X_{\{1,2\}})] \le D_{\{1,2\}}$ and

$$\begin{aligned}
I(X; X_{\{1\}}, X_{\{2\}}) &= I(X; X_{\{1\}}, X_{\{2\}}, X_{\{1,2\}}) \\
&= I(X; X_{\{1,2\}}) \\
&= 1 - H_b(D_{\{1,2\}}),
\end{aligned}$$

which implies that $X - X_{\{1,2\}} - (X_{\{1\}}, X_{\{2\}})$ form a Markov chain and $p_{X_{\{1,2\}}|X}$ is a binary symmetric channel with crossover probability $D_{\{1,2\}}$. Therefore, $p_{X X_{\{1\}} X_{\{1,2\}}}$ is completely specified by the backward test channels shown in Fig. 3. Now it is clear that one can obtain $D^*_{\{2\}}(D_{\{1\}}, D_{\{1,2\}})$ by solving the following optimization problem

$$D^*_{\{2\}}(D_{\{1\}}, D_{\{1,2\}}) = \min \mathbb{E}[d(X, g_1(X_{\{2\}}))]$$

subject to the constraints

1) $X_{\{1\}}$ and $X_{\{2\}}$ are independent;

2) $X_{\{1,2\}}$ is a deterministic function of $X_{\{1\}}$ and $X_{\{2\}}$;

3) $X - X_{\{1,2\}} - (X_{\{1\}}, X_{\{2\}})$ form a Markov chain.

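Assume that $X_{\{2\}}$ takes values in $\{0, 1, \ldots, n-1\}$ for some finite $n$. We tabulate $p_{X X_{\{1\}} X_{\{2\}} X_{\{1,2\}}}$, $p_{X_{\{1\}} X_{\{2\}}}$, $p_{X X_{\{2\}}}$ and $p_{X_{\{1\}} X_{\{2\}} X_{\{1,2\}}}$ for ease of reading.

$p_{X X_{\{1\}} X_{\{2\}} X_{\{1,2\}}}$:

$$\begin{array}{c|ccccc}
x, x_{\{1\}}, x_{\{1,2\}} \,\backslash\, x_{\{2\}} & 0 & 1 & 2 & \cdots & n-1 \\ \hline
0,0,0 & a_{0,0} & a_{0,1} & a_{0,2} & \cdots & a_{0,n-1} \\
0,0,1 & a_{1,0} & a_{1,1} & a_{1,2} & \cdots & a_{1,n-1} \\
0,1,0 & a_{2,0} & a_{2,1} & a_{2,2} & \cdots & a_{2,n-1} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
1,1,1 & a_{7,0} & a_{7,1} & a_{7,2} & \cdots & a_{7,n-1}
\end{array}$$

$p_{X_{\{1\}} X_{\{2\}}}$:

$$\begin{array}{c|ccc}
x_{\{1\}} \,\backslash\, x_{\{2\}} & 0 & \cdots & n-1 \\ \hline
0 & a_{0,0} + a_{1,0} + a_{4,0} + a_{5,0} & \cdots & a_{0,n-1} + a_{1,n-1} + a_{4,n-1} + a_{5,n-1} \\
1 & a_{2,0} + a_{3,0} + a_{6,0} + a_{7,0} & \cdots & a_{2,n-1} + a_{3,n-1} + a_{6,n-1} + a_{7,n-1}
\end{array}$$

$p_{X X_{\{2\}}}$:

$$\begin{array}{c|ccc}
x \,\backslash\, x_{\{2\}} & 0 & \cdots & n-1 \\ \hline
0 & a_{0,0} + a_{1,0} + a_{2,0} + a_{3,0} & \cdots & a_{0,n-1} + a_{1,n-1} + a_{2,n-1} + a_{3,n-1} \\
1 & a_{4,0} + a_{5,0} + a_{6,0} + a_{7,0} & \cdots & a_{4,n-1} + a_{5,n-1} + a_{6,n-1} + a_{7,n-1}
\end{array}$$

$p_{X_{\{1\}} X_{\{2\}} X_{\{1,2\}}}$:

$$\begin{array}{c|cccccc}
x_{\{1,2\}} \,\backslash\, x_{\{1\}}, x_{\{2\}} & 0,0 & \cdots & 0,n-1 & 1,0 & \cdots & 1,n-1 \\ \hline
0 & a_{0,0} + a_{4,0} & \cdots & a_{0,n-1} + a_{4,n-1} & a_{2,0} + a_{6,0} & \cdots & a_{2,n-1} + a_{6,n-1} \\
1 & a_{1,0} + a_{5,0} & \cdots & a_{1,n-1} + a_{5,n-1} & a_{3,0} + a_{7,0} & \cdots & a_{3,n-1} + a_{7,n-1}
\end{array}$$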

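According to $p_{X X_{\{1\}} X_{\{1,2\}}}$ (cf. Fig. 3), it is easy to see that

$$\begin{aligned}
\sum_{i=0}^{n-1} a_{0,i} = \sum_{i=0}^{n-1} a_{7,i} &= \frac{1}{2}(1-p)(1 - D_{\{1,2\}}), \\
\sum_{i=0}^{n-1} a_{1,i} = \sum_{i=0}^{n-1} a_{6,i} &= \frac{1}{2} p D_{\{1,2\}}, \\
\sum_{i=0}^{n-1} a_{2,i} = \sum_{i=0}^{n-1} a_{5,i} &= \frac{1}{2} p (1 - D_{\{1,2\}}), \\
\sum_{i=0}^{n-1} a_{3,i} = \sum_{i=0}^{n-1} a_{4,i} &= \frac{1}{2}(1-p) D_{\{1,2\}}.
\end{aligned} \tag{16}$$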
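The row sums in (16) can be cross-checked with exact arithmetic. The sketch below assumes the following reading of the backward test channels attributed to Fig. 3: $X_{\{1,2\}}$ is uniform on $\{0,1\}$, $X$ is obtained from $X_{\{1,2\}}$ through a binary symmetric channel with crossover probability $D_{\{1,2\}}$, $X_{\{1\}}$ is obtained from $X_{\{1,2\}}$ through a binary symmetric channel with crossover probability $p$, and $X$ and $X_{\{1\}}$ are conditionally independent given $X_{\{1,2\}}$. The function names and the numerical values of $p$ and $D_{\{1,2\}}$ are hypothetical.

```python
from fractions import Fraction
from itertools import product

def joint_x_x1_x12(p, d12):
    """Joint pmf of (X, X_{1}, X_{1,2}) under the assumed test channels."""
    q = {}
    for x, x1, x12 in product((0, 1), repeat=3):
        px = 1 - d12 if x == x12 else d12    # X_{1,2} -> X is BSC(d12)
        px1 = 1 - p if x1 == x12 else p      # X_{1,2} -> X_{1} is BSC(p)
        q[(x, x1, x12)] = Fraction(1, 2) * px * px1
    return q

def row_sums(q):
    """Row m = 4x + 2x_{1} + x_{1,2} corresponds to the sum of a_{m,i} over i."""
    return [q[((m >> 2) & 1, (m >> 1) & 1, m & 1)] for m in range(8)]

def matches_16(p, d12):
    s = row_sums(joint_x_x1_x12(p, d12))
    half = Fraction(1, 2)
    return (s[0] == s[7] == half * (1 - p) * (1 - d12)
            and s[1] == s[6] == half * p * d12
            and s[2] == s[5] == half * p * (1 - d12)
            and s[3] == s[4] == half * (1 - p) * d12)

# Hypothetical parameter values.
print(matches_16(Fraction(1, 5), Fraction(1, 10)))
```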

Furthermore, one can verify the following statements. 

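1) The fact that $X_{\{1\}}$ and $X_{\{2\}}$ are independent and that $X_{\{1\}}$ is uniformly distributed over $\{0, 1\}$ implies

$$a_{0,i} + a_{1,i} + a_{4,i} + a_{5,i} = a_{2,i} + a_{3,i} + a_{6,i} + a_{7,i}, \quad i = 0, \ldots, n-1. \tag{17}$$

2) The fact that $X_{\{1,2\}}$ is a deterministic function of $(X_{\{1\}}, X_{\{2\}})$ implies

$$(a_{0,i} + a_{4,i})(a_{1,i} + a_{5,i}) = (a_{2,i} + a_{6,i})(a_{3,i} + a_{7,i}) = 0, \quad i = 0, \ldots, n-1. \tag{18}$$

3) The fact that $X - X_{\{1,2\}} - (X_{\{1\}}, X_{\{2\}})$ form a Markov chain implies

$$\begin{aligned}
a_{0,i} &= \frac{1 - D_{\{1,2\}}}{D_{\{1,2\}}} a_{4,i}, & a_{5,i} &= \frac{1 - D_{\{1,2\}}}{D_{\{1,2\}}} a_{1,i}, \\
a_{2,i} &= \frac{1 - D_{\{1,2\}}}{D_{\{1,2\}}} a_{6,i}, & a_{7,i} &= \frac{1 - D_{\{1,2\}}}{D_{\{1,2\}}} a_{3,i}, \quad i = 0, \ldots, n-1.
\end{aligned} \tag{19}$$

According to (18), there are four possibilities for each $i$:

$$\begin{aligned}
& a_{0,i} = a_{2,i} = a_{4,i} = a_{6,i} = 0, \\
\text{or } & a_{0,i} = a_{3,i} = a_{4,i} = a_{7,i} = 0, \\
\text{or } & a_{1,i} = a_{2,i} = a_{5,i} = a_{6,i} = 0, \\
\text{or } & a_{1,i} = a_{3,i} = a_{5,i} = a_{7,i} = 0, \quad i = 0, \ldots, n-1.
\end{aligned}$$

Moreover, in view of (17), we can partition $\{0, 1, \ldots, n-1\}$ into four disjoint sets $\mathcal{S}_j$, $j = 1, 2, 3, 4$, such that

$$\begin{aligned}
a_{1,i} + a_{5,i} &= a_{3,i} + a_{7,i}, \quad i \in \mathcal{S}_1, \\
a_{1,i} + a_{5,i} &= a_{2,i} + a_{6,i}, \quad i \in \mathcal{S}_2, \\
a_{0,i} + a_{4,i} &= a_{3,i} + a_{7,i}, \quad i \in \mathcal{S}_3, \\
a_{0,i} + a_{4,i} &= a_{2,i} + a_{6,i}, \quad i \in \mathcal{S}_4.
\end{aligned} \tag{20}$$

Combining (19) and (20) yields

$$\begin{aligned}
a_{1,i} &= a_{3,i}, & a_{5,i} &= a_{7,i}, \quad i \in \mathcal{S}_1, \\
a_{1,i} &= a_{6,i}, & a_{2,i} &= a_{5,i}, \quad i \in \mathcal{S}_2, \\
a_{0,i} &= a_{7,i}, & a_{3,i} &= a_{4,i}, \quad i \in \mathcal{S}_3, \\
a_{0,i} &= a_{2,i}, & a_{4,i} &= a_{6,i}, \quad i \in \mathcal{S}_4.
\end{aligned}$$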

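It is easy to see that different values in each $\mathcal{S}_j$, $j = 1, 2, 3, 4$, can be combined. That is to say, we can assume that $X_{\{2\}}$ takes values in $\{0, 1, 2, 3\}$ with no loss of generality. As a consequence, $p_{X X_{\{1\}} X_{\{2\}} X_{\{1,2\}}}$ and $p_{X X_{\{2\}}}$ can be re-tabulated as follows.

$p_{X X_{\{1\}} X_{\{2\}} X_{\{1,2\}}}$:

$$\begin{array}{c|cccc}
x, x_{\{1\}}, x_{\{1,2\}} \,\backslash\, x_{\{2\}} & 0 & 1 & 2 & 3 \\ \hline
0,0,0 & 0 & 0 & \beta_3 & \beta_4 \\
0,0,1 & \alpha_1 & \alpha_2 & 0 & 0 \\
0,1,0 & 0 & \beta_2 & 0 & \beta_4 \\
0,1,1 & \alpha_1 & 0 & \alpha_3 & 0 \\
1,0,0 & 0 & 0 & \alpha_3 & \alpha_4 \\
1,0,1 & \beta_1 & \beta_2 & 0 & 0 \\
1,1,0 & 0 & \alpha_2 & 0 & \alpha_4 \\
1,1,1 & \beta_1 & 0 & \beta_3 & 0
\end{array}$$

$p_{X X_{\{2\}}}$:

$$\begin{array}{c|cccc}
x \,\backslash\, x_{\{2\}} & 0 & 1 & 2 & 3 \\ \hline
0 & 2\alpha_1 & \alpha_2 + \beta_2 & \alpha_3 + \beta_3 & 2\beta_4 \\
1 & 2\beta_1 & \alpha_2 + \beta_2 & \alpha_3 + \beta_3 & 2\alpha_4
\end{array}$$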


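Note that $\alpha_i$ and $\beta_i$ satisfy

$$\begin{aligned}
\beta_i &= \frac{1 - D_{\{1,2\}}}{D_{\{1,2\}}} \alpha_i, \quad i = 1, 2, 3, 4, \\
\alpha_1 + \alpha_2 &= \alpha_4 + \alpha_2 = \frac{1}{2} p D_{\{1,2\}}, \\
\alpha_1 + \alpha_3 &= \frac{1}{2}(1-p) D_{\{1,2\}},
\end{aligned}$$

where the first four equalities follow from (19) while the others follow from (16). Using $X_{\{2\}}$ to reconstruct $X$, one can achieve

$$\begin{aligned}
D_{\{2\}} &= 2\alpha_1 + \alpha_2 + \beta_2 + \alpha_3 + \beta_3 + 2\alpha_4 \\
&= \frac{1}{2} - (\beta_1 - \alpha_1 + \beta_4 - \alpha_4) \\
&= \frac{1}{2} - \frac{1 - 2D_{\{1,2\}}}{D_{\{1,2\}}} \alpha_1 - \frac{1 - 2D_{\{1,2\}}}{D_{\{1,2\}}} \alpha_4.
\end{aligned}$$

It can be easily verified that $D_{\{2\}}$ is minimized when $\alpha_1 = \alpha_4 = \frac{1}{2} p D_{\{1,2\}}$. Therefore, we have

$$\begin{aligned}
D^*_{\{2\}}(D_{\{1\}}, D_{\{1,2\}}) &= 2 p D_{\{1,2\}} + \frac{1}{2}(1 - 2p) \\
&= \frac{1}{2} + D_{\{1,2\}} - D_{\{1\}}.
\end{aligned}$$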
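The closed form can be checked against an explicit four-valued $X_{\{2\}}$. The sketch below fills the re-tabulated joint distribution with $\alpha_1 = \alpha_4 = \frac{1}{2} p D_{\{1,2\}}$, $\alpha_2 = 0$, $\alpha_3 = \frac{1}{2}(1 - 2p) D_{\{1,2\}}$, and verifies the three constraints and the resulting distortion with exact arithmetic. It assumes $p \le \frac{1}{2}$, $0 < D_{\{1,2\}} < \frac{1}{2}$, and $D_{\{1\}} = p(1 - D_{\{1,2\}}) + (1-p) D_{\{1,2\}}$; the function names and numerical values are hypothetical. It confirms that this point is feasible and attains the stated value, not that the value is minimal.

```python
from fractions import Fraction

def build_joint(p, d12):
    """Joint pmf {(x, x1, x12, x2): prob} for the four-valued X_{2} construction."""
    r = (1 - d12) / d12
    a1 = a4 = p * d12 / 2
    a2 = Fraction(0)
    a3 = (1 - p) * d12 / 2 - a1          # nonnegative when p <= 1/2
    al = [a1, a2, a3, a4]
    be = [r * a for a in al]
    z = Fraction(0)
    rows = {
        (0, 0, 0): [z, z, be[2], be[3]],
        (0, 0, 1): [al[0], al[1], z, z],
        (0, 1, 0): [z, be[1], z, be[3]],
        (0, 1, 1): [al[0], z, al[2], z],
        (1, 0, 0): [z, z, al[2], al[3]],
        (1, 0, 1): [be[0], be[1], z, z],
        (1, 1, 0): [z, al[1], z, al[3]],
        (1, 1, 1): [be[0], z, be[2], z],
    }
    return {k + (x2,): pr for k, row in rows.items() for x2, pr in enumerate(row)}

def marg(q, idx):
    """Marginal pmf on the coordinates listed in idx."""
    out = {}
    for key, pr in q.items():
        k = tuple(key[i] for i in idx)
        out[k] = out.get(k, Fraction(0)) + pr
    return out

def feasible(q, d12):
    m1, m2, m12 = marg(q, (1,)), marg(q, (3,)), marg(q, (1, 3))
    independent = all(m12[(a, b)] == m1[(a,)] * m2[(b,)] for (a, b) in m12)
    m = marg(q, (1, 3, 2))               # (x1, x2, x12)
    deterministic = all(
        sum(1 for c in (0, 1) if m[(a, b, c)] > 0) <= 1
        for a in (0, 1) for b in range(4))
    cond = marg(q, (1, 2, 3))            # (x1, x12, x2)
    markov = all(q[(x12, x1, x12, x2)] == (1 - d12) * pr
                 for (x1, x12, x2), pr in cond.items())
    return sum(q.values()) == 1 and independent and deterministic and markov

def distortion_from_x2(q):
    """Hamming distortion of the best guess of X from X_{2} alone."""
    m = marg(q, (0, 3))
    return sum(min(m[(0, j)], m[(1, j)]) for j in range(4))

p, d12 = Fraction(1, 5), Fraction(1, 10)   # hypothetical values
d1 = p * (1 - d12) + (1 - p) * d12
q = build_joint(p, d12)
print(feasible(q, d12), distortion_from_x2(q) == Fraction(1, 2) + d12 - d1)
```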